NEW DEVELOPMENTS IN CONVOLUTIONAL
ENCODING AND DECODING 


Abstract 

The historical development of convolutional encoding and de- 
coding is outlined and a brief tutorial explanation of the convolu- 
tional coding concept is given. The recent progress in encoding and 
decoding is described by citing specific examples. In particular, in 
the decoding area, the application of such techniques as: sequen- 
tial decoding, threshold decoding, and maximum likelihood decoding 
is described. Some examples of actual communication systems 
that incorporate convolutional codes are given. 

1. INTRODUCTION 

This paper will outline the new developments in convolutional encoding and 
decoding. These developments will be introduced by a brief historical and tuto- 
rial description of convolutional codes. Finally, some applications of convolu- 
tional codes in present and future systems will be described. 


2. BRIEF TUTORIAL DESCRIPTION OF CONVOLUTIONAL CODES 

In a convolutional code the check (or parity) bits are computed as a function 
of information bits as the latter are fed into the encoder serially. The encoder 
does not recognize information words, as in the case of block codes, but only 
frames, which are serial groups of information words. 

The concept [25] of a convolutional encoder is shown in Figure 1. In this 
example, a four-stage shift register allows the check bits to be computed as a 
function of up to four information bits (corresponding to a constraint length, K, 
of 4). As shown in the figure, for each information bit shifted into the encoder, 
three output bits are generated. This corresponds to a rate 1/v of one-third. In
this particular encoder, it is noted that the first output bit is equal to the bit in 
the first stage of the register, whereas the other two output bits are modulo-2 
functions of 3 or 4 stages of the register. The code in this example is called a 
systematic code because each information bit appears unchanged in the output bit 
stream. When this is not so, and each output bit is a function of more than one 
input bit, the code is called non-systematic. 

In Figure 1 the output bit stream is shown for a five-bit input bit stream or 
frame and corresponds to 9 shifts through the shift register. That is, the first 
group of 3 output bits corresponds to the shift of the first input bit into the first 
stage, and the last group of 3 output bits corresponds to the shift of the last in- 
formation bit out of the last stage. In this example zeros are shifted into the 
shift register after the last input bit. 

For a given encoder arrangement of the type shown in Figure 1, and a given 
input frame length, a code tree can be constructed. The code tree for the en- 
coder shown in Figure 1 is shown in Figure 2. This code tree is a fundamental 
concept in convolutional codes, particularly in some of the decoding techniques. 

The tree shows the encoder output bits for each input, one or zero, at each tree node. 

WHEN: x = (1, 0, 1, 1, 0) (FOR L = 5)

y = (111, 010, 100, 110, 001, 000, 011, 000, 000)



* From Wozencraft, J. M., and Jacobs, I. M.,
"Principles of Communication Engineering,"
John Wiley & Sons, Inc., N. Y., 1965, pp. 410, 411.

Figure 1. A Convolutional Encoder. 
* From Wozencraft, J. M., and Jacobs, I. M., "Principles of Communication Engineering,"
John Wiley & Sons, Inc., N. Y., 1965, p. 414.


Figure 2. Convolutional Code Tree for the Encoder in Figure 1.
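
The branch labels of such a code tree can be generated mechanically. The following is a minimal Python sketch, assuming the tap connections of the Figure 1 encoder are the generator coefficients 1000, 1111 and 1011 listed in Section 4.1; the names `branch_label` and `code_tree` are hypothetical and are introduced only for illustration.

```python
from itertools import product

# Assumed taps of the Figure 1 encoder (generator coefficients from Section 4.1).
GENERATORS = ("1000", "1111", "1011")
K = 4  # constraint length

def branch_label(path):
    """Output group on the branch reached by the last input bit of `path`."""
    # Register contents after the last bit is shifted in; stage 1 holds the newest bit.
    register = ([0] * K + list(path))[-K:][::-1]
    return "".join(
        str(sum(r for r, tap in zip(register, g) if tap == "1") % 2)
        for g in GENERATORS
    )

def code_tree(depth):
    """Map every input path of length 1..depth to the label of its last branch."""
    return {
        path: branch_label(path)
        for d in range(1, depth + 1)
        for path in product((0, 1), repeat=d)
    }

tree = code_tree(3)
for path in sorted(tree, key=lambda p: (len(p), p)):
    print(path, tree[path])
```

Following the inputs 1, 0, 1 from the origin reads off the labels 111, 010, 100, which are the first three output groups of the Figure 1 example.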
The code "tail" is the output bit sequence that results as the last
information bit in the frame is shifted into the second, third, etc. stage of
the shift register and finally out of the k-th stage. It is customary to load
a sequence of zeros into the first stage of the shift register while the end
of the information frame is being shifted through, although any known bit
sequence may be used.

3. BRIEF HISTORICAL SKETCH OF CONVOLUTIONAL ENCODING AND 
DECODING 

3.1 Encoding 


The concept of convolutional coding was introduced by Elias [3] in 1955. In 
1959, Hagelbarger [9] used the name "recurrent codes" for this type of coding, 
and Wyner and Ash [26] analyzed "recurrent" codes in 1963. 

Systematic code generators were developed by Bussgang in 1965 [1], Lin 
and Lyne in 1967 [14], Forney in 1968 [5], and Costello in 1969 [2]. Non-sys- 
tematic code generators were developed by Massey and Costello in 1970 [18] and 
Jelinek and Bahl in 1970 [13].


3.2 Decoding 


Up to the present time, there are three basic techniques for decoding con- 
volutional codes: sequential decoding, threshold decoding, and maximum-likeli- 
hood decoding. These techniques will be described in detail in the next section, 
but the following is a historical sketch of their development. Sequential decoding 
was first introduced by Wozencraft [23, 24] in 1957. In 1963, Fano [4] published 
his sequential decoding algorithm which has since been used and analyzed by 
many other workers in this field. 

Sequential decoding was analyzed by Pinsker in 1965 [19], Savage in 1966
[20], Jacobs and Berlekamp in 1967 [11] and Gallager in 1968 [6]. In 1969
Jelinek [12] published his "stack" algorithm for sequential decoding which has 
received much recent interest. Geist [7] has recently compared the performance 
of the Fano and Jelinek algorithms for sequential decoding. 

Threshold decoding for convolutional codes was proposed by Massey [16] 
in 1963. 

A maximum likelihood decoding technique for convolutional codes was
proposed by Viterbi [22] in 1967.

4. SOME NEW DEVELOPMENTS


In the development of any new system concept, one would like to find an 
optimum system and then attempt to build systems that would approach this 
optimum system. This has not been the case with convolutional encoding and 
decoding. A convolutional code does not have an algebraic structure, and this 
has necessitated random coding arguments for theoretical studies (of bounds on 
error probability, for example) and computer simulations for development of 
encoding and decoding systems. This explains the name "probabilistic code"
used in reference to a convolutional code as compared to the name "algebraic 
code" used in reference to a block code. 


4.1 Encoding 


Convolutional encoders have been divided into two types: those that generate 
systematic codes, and those that generate non-systematic codes. It is common 
practice to represent the inputs to the modulo-two adders (such as shown in 
Figure 1) in terms of generating function coefficients. 

For every convolutional encoder, we can write a set of generating function 
coefficients as follows: 

$G_1 = g_{11}\, g_{12}\, g_{13} \cdots g_{1k}$

$G_2 = g_{21}\, g_{22}\, g_{23} \cdots g_{2k}$

$\vdots$

$G_r = g_{r1}\, g_{r2}\, g_{r3} \cdots g_{rk}$

For example, the convolutional encoder in Figure 1 has the following three 
generating function coefficients: 


$G_1 = 1000$

$G_2 = 1111$

$G_3 = 1011$
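
As a check on these coefficients, the encoder can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `conv_encode` is hypothetical, and the tail is assumed to consist of K zeros, as in the Figure 1 example.

```python
# Generator coefficients of the Figure 1 encoder, one string per output bit.
GENERATORS = ("1000", "1111", "1011")

def conv_encode(info_bits, generators=GENERATORS):
    """Encode one frame, then shift in K zeros to generate the code tail."""
    k = len(generators[0])          # constraint length K (number of stages)
    register = [0] * k              # register[0] is the first stage
    groups = []
    for bit in list(info_bits) + [0] * k:
        register = [bit] + register[:-1]   # shift the new bit into stage 1
        group = ""
        for g in generators:
            # Modulo-2 sum of the stages selected by this generator.
            parity = sum(r for r, tap in zip(register, g) if tap == "1") % 2
            group += str(parity)
        groups.append(group)
    return groups

x = [1, 0, 1, 1, 0]
print(conv_encode(x))
```

For the five-bit input of Figure 1 this yields nine three-bit groups, matching the output sequence y given with that figure.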

A systematic code always has one generator function with the following 
coefficient properties: 

For $G_1 = g_{11}\, g_{12}\, g_{13} \cdots g_{1k}$,

$g_{12} = g_{13} = \cdots = g_{1k} = 0$

As noted above, Bussgang [1], Lin and Lyne [14] and Forney [5] developed
techniques for finding generator functions for systematic codes with good
error-correction properties by maximizing the minimum distance between paths in
the code tree. These properties were examined by means of computer simulations,
and the resulting generator functions are identical for many constraint lengths.
For this reason this type of systematic convolutional code has been given the
name: Bussgang-Lin-Lyne-Forney code.

Recently, Costello [2] has developed algorithms for generator functions for 
convolutional codes with constraint lengths longer than previously obtainable. 

In the area of non-systematic codes, there has been considerable interest 
in developing generator functions since non-systematic codes typically show 
better performance than systematic codes for the same constraint length K. One 
of the undesirable characteristics of a non-systematic code, however, is the 
fact that the input bit sequence no longer appears in the output bit stream, making 
a "quick-look" capability impossible before decoding.

Massey and Costello [18] developed a non-systematic code which has good 
performance, and allows "quick look." If D represents the unit delay in one stage
of the encoding shift register, then the generating function may be written: 

$G_1(D) = g_{11} + g_{12} D + g_{13} D^2 + \cdots$

$G_r(D) = g_{r1} + g_{r2} D + g_{r3} D^2 + \cdots$

Massey and Costello concentrated on a rate 1/2 non-systematic code with 


$G_1(D) = D + G_2(D)$    (4)

In this way, if the input bit sequence is represented by: 

$I(D) = i_1 + i_2 D + i_3 D^2 + \cdots$    (5)

and the output sequences from the two modulo-two adders are:

$T_1(D) = G_1(D)\, I(D)$
$T_2(D) = G_2(D)\, I(D)$    (6)

then:

$T_1(D) + T_2(D) = D\, I(D)$    (7)



Thus by adding (modulo two) the two bit sequences from the two modulo-two
adders, one obtains the original input bit stream shifted by one unit time delay.
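
The quick-look property can be checked numerically. The sketch below represents polynomials over GF(2) as Python integers (bit n is the coefficient of D^n). The particular $G_2(D)$ is a hypothetical choice made only for illustration and is not a generator taken from [18]; the helper names are likewise assumptions.

```python
def gf2_mul(a, b):
    """Multiply two GF(2) polynomials stored as ints (bit n = coefficient of D^n)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # modulo-two addition of the shifted copy
        a <<= 1
        b >>= 1
    return result

def bits_to_poly(bits):
    """Pack (i1, i2, i3, ...) so that i1 is the coefficient of D^0."""
    return sum(bit << n for n, bit in enumerate(bits))

G2 = 0b110111            # hypothetical G_2(D), for illustration only
G1 = G2 ^ 0b10           # G_1(D) = D + G_2(D): only the coefficient of D differs

info = bits_to_poly([1, 0, 1, 1, 0])
T1 = gf2_mul(G1, info)   # output stream of the first modulo-two adder
T2 = gf2_mul(G2, info)   # output stream of the second modulo-two adder

quick_look = T1 ^ T2     # modulo-two sum of the two output streams
print(quick_look == info << 1)   # input stream delayed by one unit
```

The result does not depend on the choice of $G_2(D)$, since the modulo-two sum of the generators is always D.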

As can be seen from Equation 4, the two generator functions have identical 
coefficients except for the second coefficient. 


A type of non-systematic code that has shown notably good performance 
recently is the "complement" code of Jelinek and Bahl [13]. This code has the
following generator properties: 

1. It is rate 1/2. 

2. The first and last coefficients of both generators are equal to one, or
$g_{11} = g_{1k} = g_{21} = g_{2k} = 1$.

3. The other coefficients of $G_1$ are the complements of the corresponding
coefficients of $G_2$.
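
Properties 2 and 3 determine one generator from the other. A minimal Python sketch follows; the function name `complementary_generator` and the example coefficients are hypothetical and are not generators taken from [13].

```python
def complementary_generator(g1):
    """Given the coefficients of G_1 as a string, return those of G_2.

    The end coefficients stay equal to one (property 2) and every interior
    coefficient is complemented (property 3).
    """
    if len(g1) < 2 or g1[0] != "1" or g1[-1] != "1":
        raise ValueError("first and last coefficients of G_1 must be one")
    interior = "".join("1" if c == "0" else "0" for c in g1[1:-1])
    return "1" + interior + "1"

g1 = "110101"                        # arbitrary example, for illustration only
print(g1, complementary_generator(g1))
```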


4.2 Decoding 


4.2.1 Sequential Decoding 


4.2.1.1. Fano Algorithm. As mentioned above, the technique of sequential
decoding has been dominated by the Fano algorithm [4] since its publication in
1963. The following is a brief description of the Fano algorithm. Basically,
this algorithm searches through the code tree (such as the one shown in Figure
2) for the path which has the best agreement in branch symbols with the received
sequence of symbols. This "best" agreement is determined at each branch node
by means of a function called a "metric" which is based on the logarithm of the
likelihood function or L.A.P., which stands for log-a-posteriori-probability.

For each symbol transmitted, y, and each symbol received, z, the symbol
likelihood function is:

$L_s = \frac{P(z \mid y)}{\sum_{y'} p(y')\, P(z \mid y')}$    (8)

By quantizing the received symbols (at the output of a demodulator, for
example) an improvement in performance of the algorithm is obtained
(representing an equivalent coding gain of approximately 2 dB). For
quantization, Jacobs [10] found that the difference in performance between 8
levels and an infinite number of levels is 0.2 dB. The metric that is compared
at each branch node is called the branch metric and is equal to the sum of the
symbol metrics (which are the same as the log $L_s$ functions except for a bias
value). The branch with the higher branch metric is chosen and this metric is
added to the sum of previous branch metrics (called the cumulative branch
metric) along the path being traced in the code tree. A check is made to see
if this cumulative branch metric is greater than a threshold value. This
threshold value is increased at each successful node test to bring it closer
to (but always less than) the cumulative branch metric.
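
The symbol and branch metrics can be illustrated for the simplest case. The sketch below assumes a binary symmetric channel with equiprobable channel symbols, so that the denominator of the likelihood function is 1/2, and it takes the bias equal to the code rate, which is a common choice but an assumption here. The crossover probability and the function names are hypothetical.

```python
import math

def symbol_metric(z, y, p_error, bias):
    """log2 of the symbol likelihood function P(z|y)/P(z), minus a bias."""
    p_z_given_y = 1.0 - p_error if z == y else p_error
    p_z = 0.5   # assumes equiprobable symbols on a binary symmetric channel
    return math.log2(p_z_given_y / p_z) - bias

def branch_metric(received, branch, p_error=0.05, bias=1.0 / 3.0):
    """Sum of the symbol metrics over the symbols of one branch."""
    return sum(symbol_metric(z, y, p_error, bias) for z, y in zip(received, branch))

# Received group 111 compared with the two branches leaving the origin node
# of the Figure 2 tree (labels 111 and 000).
print(branch_metric("111", "111"))
print(branch_metric("111", "000"))
```

With these assumptions an agreeing symbol contributes a small positive amount and a disagreeing symbol a large negative amount, so the cumulative metric tends to rise along the correct path and fall along incorrect ones.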

When an incorrect branch has been chosen, the threshold test typically will fail, 
and the algorithm will successively start back at previous nodes, lowering the 
threshold if necessary until the threshold test is satisfied and a new forward 
path is found. 

The above forward and backward behavior of the Fano algorithm leads to 
one of the basic problems of sequential decoding, namely the variable computa- 
tion time for decoding. The computation time for each frame (corresponding to 
a complete path through the tree) is a random variable. Various aspects of this
random variable have been studied [11, 20], particularly the influence of the
channel signal-to-noise ratio. In general, the cumulative probability distribution
of the computation time is a Pareto-type function of the form

$P_r(T_c > t_c) = K_c\, t_c^{-p}$    (9)

where $K_c$ is a constant and $p$ is the Pareto exponent.

The Pareto exponent is defined here in terms of the code rate 1/v and the
zero-rate exponent for error probability given by Gallager [6].

When $p = 1$, the rate is called the "computation rate," or $R_{comp}$. As the
frame size L goes to infinity, the computation time per decoded digit will
remain finite for $R < R_{comp}$, and will tend to infinity for
$R > R_{comp}$ [4]. This variable computation time leads to situations in which
the sequential decoder cannot keep up with the incoming data rate. In such
cases, the sequential decoder has to delete or reject data and resume decoding
with the new incoming data. Data deletion, with a corresponding deletion rate,
is an inherent characteristic of sequential decoding.
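
The role of $p = 1$ as the dividing line can be seen from the mean of a Pareto-distributed computation time. The following is a sketch under the assumption that the Pareto form holds for all $t_c$ beyond some value $t_0 > 0$; $t_0$ is introduced here only for the argument.

```latex
\begin{align*}
E[T_c] &= \int_0^{\infty} P_r(T_c > t_c)\, dt_c
        \;\le\; t_0 + \int_{t_0}^{\infty} K_c\, t_c^{-p}\, dt_c , \\
\int_{t_0}^{\infty} K_c\, t_c^{-p}\, dt_c
       &= \begin{cases}
            \dfrac{K_c\, t_0^{\,1-p}}{p-1}, & p > 1, \\[2ex]
            \infty, & p \le 1 .
          \end{cases}
\end{align*}
```

Under this assumption the mean computation time is finite only for $p > 1$, which is consistent with the behavior described above on either side of $R_{comp}$.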

Since the introduction of the Fano algorithm, it has been modified slightly
by analysts and users, particularly in the use of the threshold test [8], but it
still represents a state-of-the-art technique in sequential decoding [21].

4.2.1.2. The Jelinek algorithm [12] is another form of sequential decoding, and
the following is a brief description of its operation. A cumulative function called 
a node value very similar to the cumulative branch metric of the Fano algorithm 
(since it also includes the likelihood function) is computed for each branch and 
ordered or "stacked" according to the following procedure: 

1. Set origin node value to zero and place it in the stack. 

2. Compute the node values of the two successor nodes of the node whose 
node value is at the top of the stack, and add them to the stack in an 
ordered manner. 

3. Delete from the stack the node value of the node whose successor nodes 
were just used. 

4. If the new top node is in the last level of the tree, stop, and select the 
path determined by this last node. If not, go back to step 2. 
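
The four steps above can be sketched in Python for the Figure 1 encoder. This is an illustrative sketch, not the implementation of [12]: it assumes hard decisions on a binary symmetric channel, a node value built from the log-likelihood symbol metric with a bias equal to the code rate, and a tail of K zeros. The crossover probability and all function names are hypothetical. The stack is kept as a heap, so popping the top node combines steps 2 and 3.

```python
import heapq
import math

GENERATORS = ("1000", "1111", "1011")   # assumed taps of the Figure 1 encoder
K = 4                                   # constraint length

def branch_label(path):
    """Output group produced when the last bit of `path` enters the register."""
    register = ([0] * K + list(path))[-K:][::-1]   # stage 1 holds the newest bit
    return "".join(
        str(sum(r for r, tap in zip(register, g) if tap == "1") % 2)
        for g in GENERATORS
    )

def stack_decode(received, n_info, p_error=0.05, bias=1.0 / 3.0):
    """received: list of 3-symbol strings, one group per tree level."""
    good = math.log2(2 * (1 - p_error)) - bias   # metric of an agreeing symbol
    bad = math.log2(2 * p_error) - bias          # metric of a disagreeing symbol
    depth = len(received)
    # Step 1: origin node with value zero. Values are stored negated because
    # heapq keeps the smallest entry on top.
    stack = [(0.0, ())]
    while True:
        neg_value, path = heapq.heappop(stack)   # top node (removed: step 3)
        if len(path) == depth:                   # step 4: last level reached
            return list(path[:n_info])
        # Step 2: node values of the successors; only zeros enter in the tail.
        choices = (0, 1) if len(path) < n_info else (0,)
        for bit in choices:
            successor = path + (bit,)
            label = branch_label(successor)
            metric = sum(
                good if z == y else bad
                for z, y in zip(received[len(path)], label)
            )
            heapq.heappush(stack, (neg_value - metric, successor))

y = ["111", "010", "100", "110", "001", "000", "011", "000", "000"]
noisy = list(y)
noisy[2] = "000"                  # one channel error in the third group
print(stack_decode(noisy, 5))
```

In this example the correct path stays at the top of the stack throughout, so the decoder recovers the Figure 1 input without extending any incorrect branch.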


It can be seen that the Jelinek algorithm, in effect, can look back more than
one node at a time, in the case of noisy data; whereas the Fano algorithm looks 
back one node at a time in the same situation. However, at each node, the com- 
putation required by the Jelinek algorithm is more complex than the Fano since 
it involves a search of the entire node value stack. These tradeoffs are involved 
in the comparison of these two algorithms from the standpoint of computation 
time, and Geist [7] has made such a comparison - empirically. His results 
show almost equal computational complexity between the two algorithms, with 
the Jelinek decoder slightly faster on noisier channels. 


4.2.2. Threshold decoding [16] has been called by its author, Massey, "a simple,
but in general, non-optimum solution to the decoding problem based on a special
subset of parity checks" [17]. There are two types of threshold decoding:
majority decoding and APP (A Posteriori Probability) decoding. In each case,
a set $\{A_i\}$ of J parity checks is used which is orthogonal on the noise bit
$e_0$ corresponding to the information bit being decoded. A parity check set is
called orthogonal on a noise bit $e_0$ if $e_0$ is checked by each member of the
set, but no other noise bit is checked by more than one member of the set. The
general rule for threshold decoding is: choose $e_0 = 1$ if, and only if,


$\sum_{i=1}^{J} w_i A_i > T$    (10)

where the $w_i$ are weighting factors and T is the threshold. For majority
decoding,

$w_i = 1, \quad T = J/2;$

for APP decoding,

$w_i = 2 \log (q_i / p_i), \quad T = \sum_{i=0}^{J} \log (q_i / p_i),$

where $p_i = 1 - q_i$ is the probability of an odd number of "ones" in the noise
bits exclusive of $e_0$ which are checked by $A_i$, and $p_0 = 1 - q_0$ is the
probability that $e_0 = 1$. Compared to sequential decoding,
threshold decoding is basically simpler to implement, but requires codes that 
have the parity check orthogonality described above. The computation time is 
a constant in the case of threshold decoding as compared to the variable com- 
putation time of sequential decoding, and there is no data deletion. 
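
The threshold rule is simple enough to state directly in code. The sketch below assumes the usual majority weights (unit weights, threshold J/2) and APP weights of the form 2 log(q_i/p_i) with threshold equal to the sum of log(q_i/p_i) for i = 0 to J, which follow from comparing the two a posteriori probabilities of $e_0$. The function names and the numerical probabilities are hypothetical.

```python
import math

def threshold_decide(checks, weights, threshold):
    """Estimate e_0: one if the weighted sum of parity checks exceeds T."""
    total = sum(w * a for w, a in zip(weights, checks))
    return 1 if total > threshold else 0

def majority_decide(checks):
    """Majority decoding: unit weights and threshold J/2."""
    j = len(checks)
    return threshold_decide(checks, [1] * j, j / 2)

def app_decide(checks, p, p0):
    """APP decoding.

    p[i] is the probability that check A_i contains an odd number of ones
    among its noise bits other than e_0; p0 is the probability that e_0 = 1.
    """
    log_ratios = [math.log((1 - pi) / pi) for pi in p]
    weights = [2 * r for r in log_ratios]
    threshold = math.log((1 - p0) / p0) + sum(log_ratios)
    return threshold_decide(checks, weights, threshold)

checks = [1, 1, 0, 0]                     # J = 4 orthogonal parity checks
print(majority_decide(checks))            # a tie is not a majority
print(app_decide(checks, p=[0.01, 0.01, 0.4, 0.4], p0=0.05))
```

In this example majority decoding leaves $e_0$ at zero because only half of the checks fail, while APP decoding sets it to one because the two failing checks are far more reliable than the two that are satisfied.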

4.2.3. The Maximum Likelihood approach to decoding of convolutional codes is 
represented by the Viterbi algorithm [22]. Theoretically, the maximum likeli- 
hood decoder would compute and compare the likelihood functions for all paths 
through the code tree, given a received frame, and choose as the correct path 
the one with the largest likelihood function. 

This would involve prohibitive computation time. The Viterbi algorithm is 
a modified version of this approach with a greatly reduced computation time. 

The Viterbi algorithm utilizes the repetitive nature of a code tree: a code
tree repeats its structure after K nodes, where K is the constraint length. In
operation, this algorithm computes the likelihood function of the $2^K$ paths
from the origin to the Kth node in the code tree. The repetitive structure of
the code tree allows the elimination of $2^{K-1}$ paths at the Kth node level
since beyond the Kth node level, the code symbols on branches emanating from
certain pairs of branches are identical. The $2^{K-1}$ paths selected on the
basis of maximum likelihood are called the "survivors." In the next step the
likelihood functions of the $2^K$ paths from the origin to the (K + 1)th node
are computed and $2^{K-1}$ paths are eliminated as above. In each step, the
algorithm computes $2^K$ likelihood functions since $2^{K-1}$ branches are
added at the (K + 1)th node to the $2^{K-1}$ survivor paths. The decoder
considers $2^{K-1}$ candidate paths as it moves through the tree, and finally
selects one path at the end of the tree. This selection is really
a "narrowing-down" process which starts at the Lth node (see Figure 2). At
this tree level, zeros are fed into the encoder shift register and branching
ceases - the code "tail" is generated. With each zero encoder input, a decision
is made in the decoder algorithm, reducing the number of candidate paths by a
factor of two. Since the code tail is K levels long, then by the time the
decoder reaches the last level, it has reduced the $2^{K-1}$ candidate paths by
a factor of $2^{K-1}$, or in other words, selected one path through the tree.
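
The survivor bookkeeping can be sketched in Python for the Figure 1 encoder. This is an illustrative sketch, not the formulation of [22]: it assumes hard decisions, for which maximizing the likelihood on a binary symmetric channel amounts to minimizing Hamming distance, and a tail of K zeros. The function names are hypothetical. Paths that agree in their last K - 1 input bits are the ones that merge, so at most $2^{K-1}$ survivors are kept.

```python
GENERATORS = ("1000", "1111", "1011")   # assumed taps of the Figure 1 encoder
K = 4                                   # constraint length

def branch_label(bit, state):
    """Output group for input `bit`; `state` holds the K-1 previous bits, newest first."""
    register = (bit,) + state
    return "".join(
        str(sum(r for r, tap in zip(register, g) if tap == "1") % 2)
        for g in GENERATORS
    )

def viterbi_decode(received, n_info):
    """Hard-decision decoding; received is a list of 3-symbol strings."""
    start = (0,) * (K - 1)
    survivors = {start: (0, [])}         # state -> (distance, input bits so far)
    for level, group in enumerate(received):
        choices = (0, 1) if level < n_info else (0,)   # zeros only in the tail
        candidates = {}
        for state, (distance, path) in survivors.items():
            for bit in choices:
                label = branch_label(bit, state)
                d = distance + sum(z != y for z, y in zip(group, label))
                next_state = (bit,) + state[:-1]
                # Of the paths merging into a state, keep only the closest one.
                if next_state not in candidates or d < candidates[next_state][0]:
                    candidates[next_state] = (d, path + [bit])
        survivors = candidates
    best = min(survivors.values(), key=lambda entry: entry[0])
    return best[1][:n_info]

y = ["111", "010", "100", "110", "001", "000", "011", "000", "000"]
noisy = list(y)
noisy[1] = "110"                        # two channel errors in the frame
noisy[5] = "001"
print(viterbi_decode(noisy, 5))
```

During the tail only the zero branch is extended, so the number of survivors halves at each level until a single path remains.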


5. SOME APPLICATIONS OF CONVOLUTIONAL CODES 

Because of their high coding gain, convolutional codes have been proposed 
for use in deep space communication [10]. 

A convolutional code was used on the PIONEER IX spacecraft launched on 
November 8, 1968 [15] into a heliocentric orbit. The code was a systematic, 
rate 1/2 code with a constraint length of 25 bits and a frame length of 224 bits. 
The decoder was a Fano algorithm type programmed for a medium size computer 
(SDS 920). 

Four future NASA spacecraft will use convolutional codes: IMP (Interplan- 
etary Monitoring Platform) H, I and J; and RAE (Radioastronomy Explorer) B. 

The RAE-B spacecraft will be in lunar orbit. 

It is planned to use a convolutional code on the German-U.S. HELIOS
spacecraft, which will be launched in 1974.